nltk_contrib/timex3.py
    'Monday': 0,
    'Tuesday': 1,
    'Wednesday': 2,
    'Thursday': 3,
    'Friday': 4,
    'Saturday': 5,
    'Sunday': 6}
These capitalized day and month names did not work in every scenario for me. Converting the input text to lowercase before matching resolved these issues.
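A minimal sketch of the lowercase approach, for illustration only. The names `day_offsets`, `day_regex` and `weekday_index` are hypothetical and are not taken from timex3.py:

```python
import re

# Hypothetical lookup table keyed by lowercase day names.
day_offsets = {
    'monday': 0, 'tuesday': 1, 'wednesday': 2, 'thursday': 3,
    'friday': 4, 'saturday': 5, 'sunday': 6,
}

# One pattern built from the table keys, so the two cannot drift apart.
day_regex = re.compile(r'\b(' + '|'.join(day_offsets) + r')\b')


def weekday_index(text):
    """Return the index of the first day name found in text, or None."""
    match = day_regex.search(text.lower())  # normalise case before matching
    return day_offsets[match.group(1)] if match else None
```

With the input lowercased first, "FRIDAY", "Friday" and "friday" all resolve to the same key, so the table needs only one spelling per day.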
    month = "(january|february|march|april|may|june|july|august|september| \
    october|november|december)"
The original version had a bug here: the global variable month was overwritten, which made it unusable on subsequent calls to timex. Defining month inside the function solves the issue.
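The failure mode can be reproduced with a small sketch. This is an assumed reconstruction of the pattern, not the original timex code; `MONTH_PATTERN`, `find_month_buggy` and `find_month_fixed` are hypothetical names:

```python
import re

MONTH_PATTERN = ("(january|february|march|april|may|june|july|august|"
                 "september|october|november|december)")

# Module-level name that the buggy version accidentally reuses.
month = MONTH_PATTERN


def find_month_buggy(text):
    global month
    found = re.search(month, text.lower())
    # Bug: the shared pattern is replaced by the matched value,
    # so the next call searches for that value instead of any month.
    month = found.group(1) if found else ""
    return month


def find_month_fixed(text):
    # Fix: keep the pattern in a local name and leave the global alone.
    month_pattern = MONTH_PATTERN
    found = re.search(month_pattern, text.lower())
    return found.group(1) if found else ""
```

In this sketch the second call to `find_month_buggy` no longer recognises a different month, while `find_month_fixed` behaves the same on every call.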
    timex_found = timex_regex.findall(tagged_text)
    timex_found = map(lambda timex:re.sub(r'</?TIMEX2.*?>', '', timex), \
                timex_found)
    timexList = []
This new variable is used to return the timex values as a list, in addition to the timex-tagged text.
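A rough sketch of returning both forms, assuming plain `<TIMEX2>...</TIMEX2>` tags. The function name `collect_timexes` and the exact regex are illustrative assumptions rather than the code in the PR. Note that in Python 3 `map()` returns a lazy iterator, so the sketch uses a list comprehension to get a real list:

```python
import re

# Assumed tag shape for this sketch: <TIMEX2>...</TIMEX2> without attributes.
timex_regex = re.compile(r'<TIMEX2>.*?</TIMEX2>', re.DOTALL)


def collect_timexes(tagged_text):
    """Return the tagged text unchanged, plus the bare expressions as a list."""
    timex_found = timex_regex.findall(tagged_text)
    # Strip the opening and closing tags from each match.
    timex_list = [re.sub(r'</?TIMEX2.*?>', '', timex) for timex in timex_found]
    return tagged_text, timex_list
```

Callers that only need the tagged text can ignore the second element of the returned tuple.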
    import string
    import os
    import sys
    from datetime import datetime, timedelta
The timedelta class in Python 3's standard datetime module allows us to remove the dependency on mx.DateTime.
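As an illustration of how `timedelta` can cover simple relative expressions, here is a minimal sketch. The `RELATIVE_OFFSETS` table and `resolve_relative` helper are hypothetical and handle far fewer cases than timex does:

```python
from datetime import datetime, timedelta

# Hypothetical table mapping a few relative expressions to offsets.
RELATIVE_OFFSETS = {
    'today': timedelta(days=0),
    'tomorrow': timedelta(days=1),
    'yesterday': timedelta(days=-1),
    'next week': timedelta(weeks=1),
    'last week': timedelta(weeks=-1),
}


def resolve_relative(expression, base_date):
    """Return an ISO date string for a known expression, or None."""
    offset = RELATIVE_OFFSETS.get(expression.lower().strip())
    if offset is None:
        return None
    # datetime arithmetic handles month and leap-year boundaries.
    return (base_date + offset).strftime('%Y-%m-%d')
```

Because the date arithmetic is done by `datetime` itself, no third-party date package is needed for cases like these.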
Please see the second commit for line-by-line changes. Besides the simple conversion, it fixes a few other things that should be reviewed. Thanks in advance for looking!
Appreciated! 👍
Created a new Python 3 version of timex and kept the original version under its original file name. The new version uses the timedelta class from Python 3's datetime module, so it no longer relies on the eGenix datetime distribution that the original timex depends on in order to function. All relative time expressions work in the project I am using it in.